Speech Recognition Enhancement Via Robust CHMM Speech Background Discrimination
Abstract
This paper describes a methodology for building a robust speech recognition system based on Hidden Markov Models (HMMs). The ultimate goal is a reliable large-vocabulary isolated-word speech recogniser. The models are continuous HMMs (CHMMs) with a left-right topology, which has proved successful in speech recognition because it is consistent with the natural, sequential way in which spoken words are articulated. An important task is to extract the spoken words efficiently from their background using a 3-state CHMM and then to process them in isolation with separate 9-state models. This can be regarded as a perceptual way of extracting the signal. The technique substantially increases the performance of the system and improves the incorporation of state durations.
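The speech/background extraction step can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the three states are read as background, speech, background; each state emits a single one-dimensional Gaussian over a per-frame log-energy value; and the transition probability `stay`, the means, the variances and the helper names (`left_right_transitions`, `viterbi`, `extract_speech`) are hypothetical. The abstract does not specify the features, the emission densities or the training procedure.

```python
import math

def left_right_transitions(n_states, stay=0.6):
    """Left-right topology: each state may only stay or advance one step."""
    A = [[0.0] * n_states for _ in range(n_states)]
    for i in range(n_states - 1):
        A[i][i] = stay
        A[i][i + 1] = 1.0 - stay
    A[-1][-1] = 1.0  # final state is absorbing
    return A

def log_gauss(x, mean, var):
    """Log density of a 1-D Gaussian (assumed emission model)."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)

def safe_log(p):
    return math.log(p) if p > 0.0 else float("-inf")

def viterbi(obs, A, means, variances):
    """Most likely state path that starts in state 0 and ends in the last state."""
    n = len(A)
    delta = [log_gauss(obs[0], means[0], variances[0])] + [float("-inf")] * (n - 1)
    back = []
    for x in obs[1:]:
        new_delta, ptr = [], []
        for j in range(n):
            best_i = max(range(n), key=lambda i: delta[i] + safe_log(A[i][j]))
            ptr.append(best_i)
            new_delta.append(delta[best_i] + safe_log(A[best_i][j])
                             + log_gauss(x, means[j], variances[j]))
        delta = new_delta
        back.append(ptr)
    state = n - 1  # force the path to end in the trailing background state
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path

def extract_speech(obs, path, speech_state=1):
    """Keep only the frames aligned to the (assumed) speech state."""
    return [x for x, s in zip(obs, path) if s == speech_state]

# Toy log-energy frames: quiet, loud, quiet (invented values).
frames = [0.1, -0.2, 0.3, 4.8, 5.2, 5.1, 4.9, 0.2, -0.1, 0.0]
segmenter = left_right_transitions(3)   # 3-state background/speech/background model
path = viterbi(frames, segmenter, means=[0.0, 5.0, 0.0], variances=[1.0, 1.0, 1.0])
speech = extract_speech(frames, path)
word_model = left_right_transitions(9)  # topology of a 9-state word model
print(path, speech)
```

In this sketch the frames aligned to the middle state are what would be passed on to the 9-state word models; a practical system would use multidimensional features and trained parameters rather than the hand-set values above.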
Similar resources
Continuous HMM and Its Enhancement for Singing/Humming Query Retrieval
The use of HMM (Hidden Markov Models) for speech recognition has been successful for various applications in the past decades. However, the use of continuous HMM (CHMM) for melody recognition via acoustic input (MRAI for short), or the so-called query by singing/humming, has seldom been reported, partly due to the difference in acoustic characteristics between speech and singing/humming inputs....
Audio - Visual Continuous Speech Recogni Markov Mode
With the increase in the computational complexity of recent computers, audio-visual speech recognition (AVSR) became an attractive research topic that can lead to a robust solution for speech recognition in noisy environments. In the audio visual continuous speech recognition system presented in this paper, the audio and visual observation sequences are integrated using a coupled hidden Markov ...
Speaker independent audio-visual continuous speech recognition
The increase in the number of multimedia applications that require robust speech recognition systems has generated great interest in the study of audio-visual speech recognition (AVSR) systems. The use of visual features in AVSR is justified by both the audio and visual modalities of speech generation and the need for features that are invariant to acoustic noise perturbation. The speaker inde...
Improving the performance of MFCC for Persian robust speech recognition
The Mel frequency cepstral coefficients (MFCCs) are the most widely used features in speech recognition, but they are very sensitive to noise. In this paper, to achieve satisfactory performance in Automatic Speech Recognition (ASR) applications, we introduce a new noise-robust set of MFCC vectors estimated through the following steps. First, spectral mean normalization is a pre-processing step which applies to t...
An Information-Theoretic Discussion of Convolutional Bottleneck Features for Robust Speech Recognition
Convolutional Neural Networks (CNNs) have shown their performance in speech recognition systems for feature extraction and acoustic modeling. In addition, CNNs have been used for robust speech recognition, and competitive results have been reported. A Convolutive Bottleneck Network (CBN) is a kind of CNN which has a bottleneck layer among its fully connected layers. The bottleneck fea...
Publication date: 1999